{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pGUYKbJNWNgj"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "1PzPJglSWgnW"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b5P4BEg1XYd5"
      },
      "source": [
        "# TensorFlow Addons Optimizers: ConditionalGradient\n",
        "\n",
        "\n",
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://www.tensorflow.org/addons/tutorials/optimizers_conditionalgradient\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/addons/blob/master/docs/tutorials/optimizers_conditionalgradient.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/addons/blob/master/docs/tutorials/optimizers_conditionalgradient.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a href=\"https://storage.googleapis.com/tensorflow_docs/addons/docs/tutorials/optimizers_conditionalgradient.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Faj8luWnYNSG"
      },
      "source": [
        "# Overview\n",
        "This notebook will demonstrate how to use the Conditional Graident Optimizer from the Addons package."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MrDjqjY6YRYM"
      },
      "source": [
        "# ConditionalGradient\n",
        "\n",
        "\n",
        "> Constraining the parameters of a neural network has been shown to be beneficial in training because of the underlying regularization effects.  Often, parameters are constrained via a soft penalty (which never guarantees the constraint satisfaction) or via a projection operation (which is computationally expensive). Conditional gradient (CG) optimizer, on the other hand, enforces the constraints strictly without the need for an expensive projection step. It works by minimizing a linear approximation of the objective within the constraint set. In this notebook, you demonstrate the appliction of Frobenius norm constraint via the CG optimizer on the MNIST dataset. CG is now available as a tensorflow API. More details of the optimizer are available at https://arxiv.org/pdf/1803.06453.pdf\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dooBaYGLYYnn"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "2sCyoNXlgGbk"
      },
      "outputs": [],
      "source": [
        "!pip install -U tensorflow-addons"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qYo0FkL4O7io"
      },
      "outputs": [],
      "source": [
        "import tensorflow as tf\n",
        "import tensorflow_addons as tfa\n",
        "from matplotlib import pyplot as plt"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "kR0PnjrIirpJ"
      },
      "outputs": [],
      "source": [
        "# Hyperparameters\n",
        "batch_size=64\n",
        "epochs=10"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-x0WBp-IYz7x"
      },
      "source": [
        "# Build the Model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4KzMDUT0i1QE"
      },
      "outputs": [],
      "source": [
        "model_1 = tf.keras.Sequential([\n",
        "    tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),\n",
        "    tf.keras.layers.Dense(64, activation='relu', name='dense_2'),\n",
        "    tf.keras.layers.Dense(10, activation='softmax', name='predictions'),\n",
        "])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XGADNG3-Y7aa"
      },
      "source": [
        "# Prep the Data"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "d6a-kbM_i1b2"
      },
      "outputs": [],
      "source": [
        "# Load MNIST dataset as NumPy arrays\n",
        "dataset = {}\n",
        "num_validation = 10000\n",
        "(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()\n",
        "\n",
        "# Preprocess the data\n",
        "x_train = x_train.reshape(-1, 784).astype('float32') / 255\n",
        "x_test = x_test.reshape(-1, 784).astype('float32') / 255"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sOlB-WqjZp1Y"
      },
      "source": [
        "# Define a Custom Callback Function"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8LCmRXUgZqyV"
      },
      "outputs": [],
      "source": [
        "def frobenius_norm(m):\n",
        "    \"\"\"This function is to calculate the frobenius norm of the matrix of all\n",
        "    layer's weight.\n",
        "  \n",
        "    Args:\n",
        "        m: is a list of weights param for each layers.\n",
        "    \"\"\"\n",
        "    total_reduce_sum = 0\n",
        "    for i in range(len(m)):\n",
        "        total_reduce_sum = total_reduce_sum + tf.math.reduce_sum(m[i]**2)\n",
        "    norm = total_reduce_sum**0.5\n",
        "    return norm"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "udSvzKm4Z5Zr"
      },
      "outputs": [],
      "source": [
        "CG_frobenius_norm_of_weight = []\n",
        "CG_get_weight_norm = tf.keras.callbacks.LambdaCallback(\n",
        "    on_epoch_end=lambda batch, logs: CG_frobenius_norm_of_weight.append(\n",
        "        frobenius_norm(model_1.trainable_weights).numpy()))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qfhE1DfwZC1i"
      },
      "source": [
        "# Train and Evaluate: Using CG as Optimizer\n",
        "\n",
        "Simply replace typical keras optimizers with the new tfa optimizer "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6-AMaOYEi1kK"
      },
      "outputs": [],
      "source": [
        "# Compile the model\n",
        "model_1.compile(\n",
        "    optimizer=tfa.optimizers.ConditionalGradient(\n",
        "        learning_rate=0.99949, lambda_=203),  # Utilize TFA optimizer\n",
        "    loss=tf.keras.losses.SparseCategoricalCrossentropy(),\n",
        "    metrics=['accuracy'])\n",
        "\n",
        "history_cg = model_1.fit(\n",
        "    x_train,\n",
        "    y_train,\n",
        "    batch_size=batch_size,\n",
        "    validation_data=(x_test, y_test),\n",
        "    epochs=epochs,\n",
        "    callbacks=[CG_get_weight_norm])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8OJp4So9bYYR"
      },
      "source": [
        "# Train and Evaluate: Using SGD as Optimizer"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "SuizUueqn449"
      },
      "outputs": [],
      "source": [
        "model_2 = tf.keras.Sequential([\n",
        "    tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),\n",
        "    tf.keras.layers.Dense(64, activation='relu', name='dense_2'),\n",
        "    tf.keras.layers.Dense(10, activation='softmax', name='predictions'),\n",
        "])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "V8QC3xCwbfNl"
      },
      "outputs": [],
      "source": [
        "SGD_frobenius_norm_of_weight = []\n",
        "SGD_get_weight_norm = tf.keras.callbacks.LambdaCallback(\n",
        "    on_epoch_end=lambda batch, logs: SGD_frobenius_norm_of_weight.append(\n",
        "        frobenius_norm(model_2.trainable_weights).numpy()))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "9BNi4yXGcDlg"
      },
      "outputs": [],
      "source": [
        "# Compile the model\n",
        "model_2.compile(\n",
        "    optimizer=tf.keras.optimizers.SGD(0.01),  # Utilize SGD optimizer\n",
        "    loss=tf.keras.losses.SparseCategoricalCrossentropy(),\n",
        "    metrics=['accuracy'])\n",
        "\n",
        "history_sgd = model_2.fit(\n",
        "    x_train,\n",
        "    y_train,\n",
        "    batch_size=batch_size,\n",
        "    validation_data=(x_test, y_test),\n",
        "    epochs=epochs,\n",
        "    callbacks=[SGD_get_weight_norm])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1Myw0FVcd_Z9"
      },
      "source": [
        "# Frobenius Norm of Weights: CG vs SGD"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0tJYQBRt-ZUl"
      },
      "source": [
        "The current implementation of CG optimizer is based on Frobenius Norm, with considering Frobenius Norm as regularizer in the target function. Therefore, you compare CG’s regularized effect with SGD optimizer, which has not imposed Frobenius Norm regularizer."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Ewf17MW1cJVI"
      },
      "outputs": [],
      "source": [
        "plt.plot(\n",
        "    CG_frobenius_norm_of_weight,\n",
        "    color='r',\n",
        "    label='CG_frobenius_norm_of_weights')\n",
        "plt.plot(\n",
        "    SGD_frobenius_norm_of_weight,\n",
        "    color='b',\n",
        "    label='SGD_frobenius_norm_of_weights')\n",
        "plt.xlabel('Epoch')\n",
        "plt.ylabel('Frobenius norm of weights')\n",
        "plt.legend(loc=1)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JGtutiXuoZyx"
      },
      "source": [
        "# Train and Validation Accuracy: CG vs SGD\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "s-SNIr10o2va"
      },
      "outputs": [],
      "source": [
        "plt.plot(history_cg.history['accuracy'], color='r', label='CG_train')\n",
        "plt.plot(history_cg.history['val_accuracy'], color='g', label='CG_test')\n",
        "plt.plot(history_sgd.history['accuracy'], color='pink', label='SGD_train')\n",
        "plt.plot(history_sgd.history['val_accuracy'], color='b', label='SGD_test')\n",
        "plt.xlabel('Epoch')\n",
        "plt.ylabel('Accuracy')\n",
        "plt.legend(loc=4)"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [],
      "name": "optimizers_conditionalgradient.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
